    Dataflow Restructuring for Active Memory Reduction in Deep Neural Networks

    Reducing the volume of the activation maps produced by the hidden layers of a Deep Neural Network (DNN) is critical in modern applications because it affects on-chip memory utilization, the most limited and costly hardware resource. Despite the availability of many compression methods that leverage the statistical nature of deep learning to approximate and simplify the inference model, e.g., quantization and pruning, there is room for deterministic optimizations that instead tackle the problem from a computational standpoint. This work belongs to the latter category: it introduces a novel method for minimizing the active memory footprint. The proposed technique, which is data-, model-, compiler-, and hardware-agnostic, implements a function-preserving, automated graph restructuring in which memory peaks are suppressed and distributed over time, leading to flatter profiles with less memory pressure. Results collected on a representative class of convolutional DNNs with different topologies, from VGG16 and SqueezeNetV1.1 to the more recent MobileNetV2, ResNet18, and InceptionV3, provide clear evidence of applicability, showing remarkable memory savings (62.9% on average) with low computational overhead (8.6% on average).
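The abstract does not describe the restructuring algorithm itself, so the following is only a minimal sketch of the general idea of trading one large activation buffer for several small ones without changing the computed function. It assumes a toy two-layer block (expand, ReLU, project) and splits the hidden dimension into tiles; the function names `dense_full` and `dense_tiled` and the tiling choice are illustrative assumptions, not the paper's method.

```python
def relu(v):
    return v if v > 0.0 else 0.0


def dense_full(x, w1, w2):
    """Reference: materialise the whole hidden vector, then project it."""
    n_hidden = len(w1[0])
    hidden = [relu(sum(xi * w1[i][j] for i, xi in enumerate(x)))
              for j in range(n_hidden)]
    out = [sum(h * w2[j][k] for j, h in enumerate(hidden))
           for k in range(len(w2[0]))]
    return out, len(hidden)  # second value: peak number of live hidden values


def dense_tiled(x, w1, w2, tile):
    """Same function, but at most `tile` hidden values are alive at once."""
    n_hidden = len(w1[0])
    out = [0.0] * len(w2[0])
    peak = 0
    for start in range(0, n_hidden, tile):
        cols = range(start, min(start + tile, n_hidden))
        # Compute one slice of the hidden layer ...
        part = [relu(sum(xi * w1[i][j] for i, xi in enumerate(x)))
                for j in cols]
        peak = max(peak, len(part))
        # ... and fold it into the output before the next slice is produced.
        for k in range(len(out)):
            out[k] += sum(h * w2[j][k] for h, j in zip(part, cols))
    return out, peak
```

Because ReLU is element-wise and the projection is a sum over hidden units, both functions return the same output up to floating-point rounding, while the tiled version keeps a smaller intermediate buffer alive. This toy case needs no recomputation; restructurings of real convolutional graphs generally do, which is consistent with the computational overhead the abstract reports.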

    On the Efficiency of Sparse-Tiled Tensor Graph Processing for Low Memory Usage

    The memory space needed to host and process large tensor graphs is a limiting factor for embedded ConvNets. Even though many data-driven compression pipelines have proven their efficacy, this work shows there is still room for optimization at the intersection with compute-oriented optimizations. We demonstrate that tensor pruning via weight sparsification can cooperate with a model-agnostic tiling strategy, leading ConvNets towards a new feasible region of the solution space. The collected results show, for the first time, fast versions of MobileNets deployed at full scale on an ARM M7 core with 512KB of RAM and 2MB of FLASH memory.

    Axp: A hw-sw co-design pipeline for energy-efficient approximated convnets via associative matching

    Reducing energy consumption is key for deep neural networks (DNNs) to ensure usability and reliability, whether they are deployed on low-power end-nodes with limited resources or on high-performance platforms that serve large pools of users. By leveraging the over-parametrization shown by many DNN models, convolutional neural networks (ConvNets) in particular, energy efficiency can be improved substantially while preserving model accuracy. The solution proposed in this work exploits the intrinsic redundancy of ConvNets to maximize the reuse of partial arithmetic results during the inference stages. Specifically, the weight-set of a given ConvNet is discretized through a clustering procedure such that the largest possible number of inner multiplications fall into predefined bins; this allows an off-line computation of the most frequent results, which in turn can be stored locally and retrieved when needed during the forward pass. Such a reuse mechanism leads to remarkable energy savings with the aid of a custom processing element (PE) that integrates an associative memory with a standard floating-point unit (FPU). Moreover, the adoption of an approximate associative rule based on a partial bit-match increases the hit rate over the pre-computed results, increasing the energy reduction even further. Results collected on a set of ConvNets trained for computer vision and speech processing tasks reveal that the proposed associative-based hw-sw co-design achieves up to 77% energy savings with less than 1% accuracy loss.
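As a rough software analogue of the reuse mechanism described above, the sketch below clusters weights into a small codebook, precomputes products for a fixed set of activation levels, and looks them up during a dot product, falling back to a real multiplication on a miss. Everything here is an assumption made for illustration: the 1-D k-means, the uniform activation step standing in for the partial bit-match rule, and the names `kmeans_1d`, `build_table`, and `dot_with_reuse`. It does not model the custom PE or its energy behaviour.

```python
def kmeans_1d(values, k, iters=20):
    """Plain Lloyd iterations on scalars; returns k centroids."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            buckets[nearest(v, centroids)].append(v)
        # Keep the old centroid when a bucket ends up empty.
        centroids = [sum(b) / len(b) if b else centroids[i]
                     for i, b in enumerate(buckets)]
    return centroids


def nearest(value, centroids):
    """Index of the centroid closest to `value`."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


def build_table(centroids, levels, step):
    """Off-line stage: products for every (centroid, activation level) pair."""
    return {(ci, lv): c * lv * step
            for ci, c in enumerate(centroids) for lv in levels}


def dot_with_reuse(weight_ids, activations, centroids, table, step):
    """On-line stage: reuse stored products, multiply only on a miss."""
    acc, hits = 0.0, 0
    for wi, a in zip(weight_ids, activations):
        level = round(a / step)  # approximate match on the activation
        if (wi, level) in table:
            acc += table[(wi, level)]
            hits += 1
        else:
            acc += centroids[wi] * a  # fall back to the multiplier
    return acc, hits
```

The hit count returned by `dot_with_reuse` plays the role of the hit rate mentioned in the abstract: a coarser `step` raises it at the cost of a larger approximation error.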

    Fast and Accurate Inference on Microcontrollers with Boosted Cooperative Convolutional Neural Networks (BC-Net)

    Arithmetic precision scaling is mandatory to deploy Convolutional Neural Networks (CNNs) on resource-constrained devices such as microcontrollers (MCUs), and fixed-point quantization and binarization are the most widely adopted techniques today. Although they stem from the same concept of bit-width lowering, these two strategies differ substantially from each other, and hence are often conceived and implemented separately. However, their joint integration is feasible and, if properly implemented, can bring large savings and high processing efficiency. This work elaborates on this aspect by introducing a boosted collaborative mechanism that pushes CNNs towards higher performance and greater predictive capability. Referred to as BC-Net, the proposed solution consists of a self-adaptive conditional scheme where a lightweight binary net and an 8-bit quantized net are trained to cooperate dynamically. Experiments conducted on four different CNN benchmarks deployed on off-the-shelf boards powered by MCUs of the ARM Cortex-M family show that BC-Nets outperform classical quantization and binarization when applied as separate techniques (up to 81.49% speed-up and up to 3.8% accuracy improvement). The comparative analysis with a previously proposed cooperative method also demonstrates that BC-Nets achieve substantial gains in terms of both performance (+19%) and accuracy (+3.45%).

    Energy-efficient and Privacy-aware Social Distance Monitoring with Low-resolution Infrared Sensors and Adaptive Inference

    Low-resolution infrared (IR) sensors combined with machine learning (ML) can be leveraged to implement privacy-preserving social distance monitoring solutions in indoor spaces. However, the need to execute these applications on Internet of Things (IoT) edge nodes makes energy consumption critical. In this work, we propose an energy-efficient adaptive inference solution consisting of the cascade of a simple wake-up trigger and an 8-bit quantized Convolutional Neural Network (CNN), which is only invoked for difficult-to-classify frames. Deploying such an adaptive system on an IoT microcontroller, we show that, when processing the output of an 8×8 low-resolution IR sensor, we are able to reduce the energy consumption by 37-57% with respect to a static CNN-based approach, with an accuracy drop of less than 2% (83% balanced accuracy).
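The cascade described above can be sketched as a cheap test that decides easy frames on its own and hands the rest to the expensive model. The abstract does not specify the trigger, so the hot-pixel count used here, the temperature parameters, and the names `hot_pixels` and `adaptive_classify` are all hypothetical; the CNN is represented by any callable passed in by the caller.

```python
from typing import Callable, Sequence, Tuple

Frame = Sequence[Sequence[float]]  # e.g. an 8x8 grid of temperatures


def hot_pixels(frame: Frame, ambient: float, margin: float) -> int:
    """Count pixels that are clearly warmer than the assumed ambient level."""
    return sum(1 for row in frame for t in row if t > ambient + margin)


def adaptive_classify(
    frame: Frame,
    cnn: Callable[[Frame], int],
    ambient: float = 22.0,   # assumed room temperature in Celsius
    margin: float = 2.0,     # assumed noise margin
    min_hot: int = 3,        # assumed trigger threshold
) -> Tuple[int, bool]:
    """Return (label, used_cnn); label 0 is the trigger's 'nothing here' class."""
    if hot_pixels(frame, ambient, margin) < min_hot:
        return 0, False          # easy frame: decided by the trigger alone
    return cnn(frame), True      # hard frame: wake up the CNN
```

The energy saving of such a scheme depends on how often the second branch is taken, which is why the returned flag is useful for profiling.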

    Enabling monocular depth perception at the very edge

    Depth estimation is crucial in several computer vision applications, and a recent trend aims at inferring such a cue from a single camera through computationally demanding CNNs, which precludes their practical deployment in application contexts characterized by low-power constraints. To this end, we develop a tiny network tailored to microcontrollers that processes low-resolution images to obtain a coarse depth map of the observed scene. Our solution enables depth perception with minimal power requirements (a few hundred mW), accurately enough to pave the way to several high-level applications at the edge.

    Ultra-compact binary neural networks for human activity recognition on RISC-V processors

    Human Activity Recognition (HAR) is a relevant inference task in many mobile applications. State-of-the-art HAR at the edge is typically achieved with lightweight machine learning models such as decision trees and Random Forests (RFs), whereas deep learning is less common due to its high computational complexity. In this work, we propose a novel implementation of HAR based on deep neural networks, and specifically on Binary Neural Networks (BNNs), targeting low-power general-purpose processors with a RISC-V instruction set. BNNs yield very small memory footprints and low inference complexity, thanks to the replacement of arithmetic operations with bit-wise ones. However, existing BNN implementations on general-purpose processors impose constraints tailored to complex computer vision tasks, which result in over-parametrized models for simpler problems like HAR. Therefore, we also introduce a new BNN inference library, which targets ultra-compact models explicitly. With experiments on a single-core RISC-V processor, we show that BNNs trained on two HAR datasets obtain higher classification accuracy compared to a state-of-the-art baseline based on RFs. Furthermore, our BNN reaches the same accuracy as an RF with either less memory (up to 91%) or higher energy efficiency (up to 70%), depending on the complexity of the features extracted by the RF.
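The replacement of arithmetic with bit-wise operations mentioned above is commonly realised as an XOR (or XNOR) followed by a population count. The sketch below shows that generic technique on packed ±1 vectors; it is not the inference library introduced in the paper, and the helper names `pack`, `binary_dot`, and `binary_dense` as well as the per-neuron thresholds are illustrative assumptions.

```python
def pack(signs):
    """Pack a +1/-1 vector into an integer: bit i is 1 when signs[i] is +1."""
    word = 0
    for i, s in enumerate(signs):
        if s > 0:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Matching bits contribute +1 and mismatching bits -1, so the result is
    n - 2 * (number of mismatches), with mismatches counted via XOR.
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


def binary_dense(x_signs, weight_rows, thresholds):
    """One binarized dense layer: sign(dot - threshold) for each output unit."""
    n = len(x_signs)
    x_bits = pack(x_signs)
    return [1 if binary_dot(x_bits, pack(row), n) >= t else -1
            for row, t in zip(weight_rows, thresholds)]
```

On a processor with a native popcount instruction the inner loop collapses to a few word-level operations, which is where the memory and energy benefits of BNNs come from.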

    Immunological investigation in a cohort of pediatric patients with DiGeorge syndrome: correlations with vitamin D status.

    DiGeorge syndrome (DGS) is caused by a microdeletion of the long arm of chromosome 22. It was first observed by the Italian-American endocrinologist Angelo DiGeorge in a group of children who shared a common clinical picture, characterized by cardiac malformations, facial dysmorphisms, neonatal seizures due to hypocalcemia resulting from hypoparathyroidism, and recurrent infections due to aplasia of the thymus and parathyroids. Hypocalcemia, frequently observed in the neonatal period, usually resolves, but in some children hypoparathyroidism persists, requiring treatment with calcium and vitamin D. An increased incidence of autoimmune diseases has also been described recently; their etiopathogenesis, still not established, appears to be multifactorial. The spectrum of immunological abnormalities is very broad and can range from a normal immunological profile to a complete absence of lymphocytes requiring a thymus or bone marrow transplant. Because of this variability, patients are classified as "complete DGS" when they present thymic aplasia associated with severe or complete T lymphopenia (≤ 1.5% of patients) and as "partial DGS" when they instead show thymic hypoplasia. In the latter group the immunological defect is mild to moderate and may include numerical and/or functional defects of T lymphocytes, as well as defects of humoral immunity. Quantitative and/or functional alterations of dendritic cells (DCs) have been identified in several pathological conditions (allergies, autoimmune diseases, tumors, etc.); however, their involvement in DGS has not yet been studied. Owing to their functional properties, DCs play a fundamental role in the communication between innate and adaptive immunity.
They represent the most important family of antigen-presenting cells (APCs), capable of efficiently triggering the response of naive, memory, and effector T cells; they are also involved in the maintenance of tolerance. Circulating DCs can be divided, on the basis of distinct phenotypic and functional profiles, into two subtypes, myeloid (mDCs) and plasmacytoid (pDCs), which exert complementary effects on T cells: while mDCs are efficient APCs, pDCs are implicated in immune tolerance. Multiple extra-skeletal effects of vitamin D have recently been recognized, in particular those on the immune system, given the presence of its receptor (VDR) on many immune cells (activated CD4 and CD8 T lymphocytes, B lymphocytes, neutrophils, DCs, macrophages). 1,25-dihydroxyvitamin D (1,25(OH)2D3) acts on T cells both directly and indirectly, by regulating the production of cytokines that are decisive for lymphocyte differentiation. The direct inhibitory effects of 1,25(OH)2D3 are more pronounced on memory and effector T cells, since these cells have a greater number of VDRs on their surface than naive cells. 1,25(OH)2D3 decreases the proliferation, differentiation, and antibody production of B lymphocytes and increases their apoptosis. Regarding the effects on DCs, in vitro studies have shown that 1,25(OH)2D3 inhibits their maturation and differentiation, leading to the formation of cells involved in tolerance mechanisms and inhibiting the formation of cells involved in defense mechanisms. The effect of 1,25(OH)2D3 on the two DC subtypes (myeloid and plasmacytoid) is not equivalent, since it preferentially regulates the mDC subtype, with consequent suppression of naive T cell activation.
Numerous studies have found an inverse correlation between vitamin D levels and the incidence of several diseases, such as infectious diseases (tuberculosis, influenza, HIV), autoimmune diseases (multiple sclerosis, type 1 diabetes mellitus, rheumatic diseases, psoriasis, chronic inflammatory bowel diseases), asthma, cardiovascular diseases, and neoplasms. Aims of the study: Since the immunodeficiency found in DGS is caused by thymic hypoplasia, most studies have focused on the number and function of T lymphocytes. However, the study of these parameters of cell-mediated immunity has not proved exhaustive and has not made it possible to identify the DGS patients at greatest risk of developing severe infections and/or autoimmune complications. This study therefore aimed at a more in-depth immunological investigation of patients with DGS, with particular attention to the evaluation of dendritic cells. Given the recognized immunomodulatory effects of vitamin D, vitamin D was measured in the entire cohort of patients studied; the patients were also divided into groups according to vitamin D status and the presence of hypoparathyroidism, in order to assess whether correlations could be identified between serum vitamin D levels and immune deficit. Finally, the group of deficient patients underwent supplementation and subsequent immunological re-evaluation, in order to highlight any effects of vitamin D on the immune system.

    Nanoionics-Based Three-Terminal Synaptic Device Using Zinc Oxide

    Artificial synaptic thin-film transistors (TFTs) capable of simultaneously manifesting signal transmission and self-learning are demonstrated using transparent zinc oxide (ZnO) in combination with high-κ tantalum oxide as the gate insulator. The devices exhibit pronounced memory retention with a memory window in excess of 4 V realized using an operating voltage of less than 6 V. Gate-polarity-induced motion of oxygen vacancies in the gate insulator is proposed to play a vital role in emulating synaptic behavior, directly measured as the transmission of a signal between the source and drain (S/D) terminals, but with the added benefit of independent control of synaptic weight. Unlike in two-terminal memristor/resistive switching devices, multistate memory levels are demonstrated using the gate terminal without hampering the signal transmission across the S/D electrodes. Synaptic functions in the devices can be emulated using a low programming voltage of 200 mV, an order of magnitude smaller than in conventional resistive random access memory and other field-effect-transistor-based synaptic technologies. The robust synaptic properties demonstrated using the fully transparent, eco-friendly inorganic materials chosen here show greater promise for realizing scalable synaptic devices than organic synaptic and other liquid-electrolyte-gated device technologies. Most importantly, the strong coupling between the in-plane gate and the semiconductor channel through ionic charge in the gate insulator shown by these devices can lead to an artificial neural network with multiple presynaptic terminals for complex synaptic learning processes. This provides opportunities to alleviate the extreme requirements of component and interconnect density in realizing brain-like systems.

    Adjustment of interbank lending in pre-and post-regulation periods: Empirical analysis of Vietnamese commercial banks

    This article analyzes the adjustment of interbank lending during the period in which Vietnamese commercial banks comply with some parts of the Basel regulation framework. A pilot regulation period started in 2011 and full application will be effective by the end of 2018. Partial adjustment models and variance decomposition are used for the analyses. In the analysis of the quarterly financial statements released by Vietnamese commercial banks from 2008/Q1 to 2015/Q4, the empirical evidence showed that, throughout the period, lending to non-banks and high-liquidity assets contributed to the adjustment in both the long run and the short run, with a negative association. In addition, the loan loss allowance contributed to the adjustment in the post-regulation period only, with a positive association. These highly contributing factors also show a potential shock after the adjustment of the interbank lending. The results imply that there is a need for an interbank lending portfolio report and efficient control over the IRB of Vietnamese commercial banks. © Foundation of International Studies, 2017 & CSR, 2017
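For readers unfamiliar with partial adjustment models, the textbook form is given below. The abstract does not state the article's exact specification, so the symbols are assumptions: $y_{i,t}$ stands for the interbank lending of bank $i$ in quarter $t$, $y^{*}_{i,t}$ for its unobserved target, $x_{i,t}$ for the explanatory factors (for example non-bank lending, liquid assets, loan loss allowance), and $\lambda$ for the speed of adjustment.

```latex
\begin{align}
  y^{*}_{i,t} &= \beta^{\top} x_{i,t} \\
  y_{i,t} - y_{i,t-1} &= \lambda \left( y^{*}_{i,t} - y_{i,t-1} \right) + \varepsilon_{i,t},
    \qquad 0 < \lambda \le 1 \\
  y_{i,t} &= (1 - \lambda)\, y_{i,t-1} + \lambda\, \beta^{\top} x_{i,t} + \varepsilon_{i,t}
\end{align}
```

The third line follows by substituting the first into the second. In this form $\lambda\beta$ is the short-run effect of a factor and $\beta$ its long-run effect, which is the distinction the abstract draws when it reports contributions in both the short run and the long run.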